Instructing a Reinforcement Learner

Abstract

In reinforcement learning (RL), rewards have been considered the most important channel for understanding an environment’s dynamics and have been used very effectively as a feedback mechanism. Recently, however, there have been interesting forays into other modes of understanding the environment. One such alternative is the use of sporadic supervisory inputs, which bring rich information about the world of interest into the learning process. In this paper, we model these supervisory inputs as instructions, give a mathematical formulation for them, and propose a framework for incorporating them into the learning process.


Similar resources

Instructing a Reinforcement Learner

In reinforcement learning (RL), rewards have been considered the most important feedback in understanding the environment. However, recently there have been interesting forays into other modes such as using sporadic supervisory inputs. This brings into the learning process richer information about the world of interest. In this paper, we model these supervisory inputs as specific types of instr...


Guiding a Reinforcement Learner with Natural Language Advice: Initial Results in RoboCup Soccer

We describe our current efforts towards creating a reinforcement learner that learns both from reinforcements provided by its environment and from human-generated advice. Our research involves two complementary components: (a) mapping advice expressed in English to a formal advice language and (b) using advice expressed in a formal notation in a reinforcement learner. We use a subtask of the ch...


Introducing interactive help for reinforcement learners

The reinforcement learning problem is a very difficult problem when considering real-size applications. To solve it, we think that many issues should be studied altogether. To achieve such an endeavor, we also think that it is quite common that human beings can provide help on-the-fly to the reinforcement learner, that is when he/she sees how the learner is (mis)behaving, or could perform bette...


Inverse Reinforcement Learning Under Noisy Observations (Extended Abstract)

We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, noisy observations of the trajectory are available. We generalize the previous method of expectation-maximization for inverse reinforcement learning, which allows the trajectory of the expert to be partially hidden from the learner, to incorpo...


Agnostic KWIK learning and efficient approximate reinforcement learning

A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the environment. It has been shown that such a model-based learner is efficient if the model learner is efficient in the so-called “knows what it knows” (KWIK) framework. A major limitation of the standard KWIK framework is that, by it...




Publication date: 2011